API Monitoring
Learn the importance of API monitoring.
What is API monitoring?#
Modern applications often rely on a web of services to meet their functional and non-functional goals, and this complex system needs continuous, automated monitoring to meet business service-level agreements (SLAs). The APIs between services provide an excellent vantage point for spotting anomalies in time. For large systems, catching issues early pays off in customer satisfaction and in lower operational costs.
If an API responds slowly, we must identify the problem before we can fix it, and continuous monitoring is what makes that identification possible. API monitoring is the process of analyzing overall API performance to identify problems that can affect the developers and users of the API. It checks whether all the resources behind an API call are available and respond on time.
Why is API monitoring required?#
APIs are the building blocks of most digital applications because business transactions and workflows depend on them. It’s impossible to monitor services manually at all times. Continuous API monitoring lets us check around the clock that everything is working properly and helps us trace problems to their root. The following illustration shows the impact of successful API responses and failed API responses.
Let’s summarize the key benefits API monitoring offers to software developers:
The development team can see the actual issue and address it directly. For example, if the requests per minute (RPM) drop, the monitor shows how the CPU and RAM are performing, so a suitable fix can be applied.
Bugs are logged, and related alerts are generated, mostly before customers report them.
Analyzing the API's behavior helps improve performance to meet the service-level objectives (SLOs).
Businesses using third-party APIs can also use API monitoring to see if the service providers meet the minimum requirements agreed upon in the SLA.
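As a rough illustration of checking a provider against agreed targets, the sketch below compares observed latency and availability with two thresholds. The function names, the 95th-percentile choice, and the target values (300 ms and 99.9%) are assumptions made for this example, not figures from any real SLA.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def meets_targets(latencies_ms, successes, total,
                  max_p95_ms=300.0, min_availability=0.999):
    """Return (ok, report) for hypothetical latency and availability targets."""
    if total <= 0:
        raise ValueError("total must be positive")
    p95 = percentile(latencies_ms, 95)
    availability = successes / total
    report = {"p95_ms": p95, "availability": availability}
    ok = p95 <= max_p95_ms and availability >= min_availability
    return ok, report
```

A monitor could run such a check on each reporting period's data and flag the periods in which the provider fell short.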
How does API monitoring work?#
A remote API monitor sends HTTP requests to the different endpoints of an API. An API monitoring tool analyzes the responses from the API servers against different metrics (user-defined test cases or variables that describe the expected behavior), such as response time, expected data, and the correctness of the overall response. It then generates a report highlighting the metrics that fall outside expectations, and the development team reviews the report to address the issues. The illustration below provides a high-level view of how a remote monitoring service works.
Remember: Monitoring and alarms are often based on error rates exceeding some threshold per unit of time.
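One way to picture a threshold-based alarm is a sliding time window over recent requests. The sketch below is a minimal version of that idea; the class name, the 60-second window, the 5% threshold, and the minimum request count are all illustrative assumptions.

```python
from collections import deque


class ErrorRateAlarm:
    """Signals an alert when the error rate over the last `window_s` seconds exceeds `threshold`."""

    def __init__(self, window_s=60.0, threshold=0.05, min_requests=10):
        self.window_s = window_s
        self.threshold = threshold
        self.min_requests = min_requests  # avoids alerting on a handful of requests
        self._events = deque()            # (timestamp, is_error), oldest first
        self._errors = 0

    def record(self, timestamp, is_error):
        """Record one request; timestamps are assumed to be non-decreasing."""
        self._events.append((timestamp, is_error))
        if is_error:
            self._errors += 1
        self._evict(timestamp)

    def _evict(self, now):
        # Drop events that have fallen out of the window.
        while self._events and self._events[0][0] <= now - self.window_s:
            _, was_error = self._events.popleft()
            if was_error:
                self._errors -= 1

    def error_rate(self):
        return self._errors / len(self._events) if self._events else 0.0

    def should_alert(self):
        enough_traffic = len(self._events) >= self.min_requests
        return enough_traffic and self.error_rate() > self.threshold
```

The minimum request count is a design choice: without it, a single failed request during a quiet period would push the error rate to 100% and trigger a noisy alert.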
API monitoring methods#
Once an API is ready to be monitored, a business must decide which type of monitoring to use. Depending on the use case, API monitoring falls into two categories: synthetic monitoring and real user monitoring. Let's discuss each method below.
Synthetic API monitoring#
Synthetic API monitoring is an active way to test an API's performance and response time with scripts that simulate the path an end user will take to communicate with an API. The testing is performed by executing scripts from different geographical locations and around the clock to analyze the performance.
By using this type of monitoring, we can verify that changes made to the APIs improve performance rather than hinder it. Because scripts rather than human clients drive the tests, issues that arise are more likely to originate in the APIs than on the client side. This allows developers to focus on improving the APIs. Synthetic API monitoring logs metrics and provides insight into how the API can be improved before problems affect the customer experience.
However, there are some limitations to this type of API monitoring. Because the monitoring relies on scripts, its results may differ from what real users experience under real-world conditions.
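A synthetic check can be sketched as a small script that calls an endpoint and compares the response with expectations. The version below uses only the Python standard library; the function names, the 500 ms latency budget, and the idea of checking for expected JSON keys are assumptions for illustration, and a real monitor would run such checks on a schedule from several locations.

```python
import json
import time
import urllib.error
import urllib.request


def http_get(url, timeout_s=5.0):
    """Default fetcher: returns (status_code, body_bytes)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        # urlopen raises on 4xx/5xx; report those as ordinary responses.
        return err.code, err.read()


def run_check(url, expected_keys, max_latency_ms=500.0, fetch=http_get):
    """Probe one endpoint and return a dict describing what passed or failed."""
    started = time.perf_counter()
    try:
        status, body = fetch(url)
    except OSError as err:  # URLError and socket timeouts are OSError subclasses
        return {"url": url, "ok": False, "failures": [f"request failed: {err}"]}
    latency_ms = (time.perf_counter() - started) * 1000.0

    failures = []
    if status != 200:
        failures.append(f"unexpected status {status}")
    if latency_ms > max_latency_ms:
        failures.append(f"slow response: {latency_ms:.0f} ms")
    try:
        payload = json.loads(body)
    except ValueError:
        failures.append("body is not valid JSON")
    else:
        if not isinstance(payload, dict):
            failures.append("body is not a JSON object")
        else:
            missing = [key for key in expected_keys if key not in payload]
            if missing:
                failures.append(f"missing keys: {missing}")
    return {"url": url, "ok": not failures,
            "latency_ms": latency_ms, "failures": failures}
```

The `fetch` parameter exists so the check can be exercised without a network, for example `run_check("https://api.example.com/users/1", ["id", "name"], fetch=lambda url: (200, b'{"id": 1, "name": "x"}'))`, where the URL is a placeholder.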
Real user monitoring (RUM)#
Real user monitoring (RUM) captures users' experiences by analyzing their interactions with the APIs in a production environment. This passive form of monitoring analyzes page load events, HTTP requests and their responses, timeouts, and events that can crash an application. Instead of executing scripts, RUM takes data from real users and their devices. These events characterize the user experience and show what needs to be fixed, and where, before many users are affected. However, RUM has limitations, such as the resources and effort required to collect and analyze data in a real-world environment with actual users.
Point to Ponder: In RUM, how are real users involved in monitoring an API’s behavior?
RUM works by injecting predefined code into the application. The values of performance and user experience metrics are captured while real users are using the API. For browser-based applications, JavaScript code is injected to monitor page loads and then the XHR requests. Analyzing the collected data allows us to improve the APIs. The following illustration depicts how RUM sits inside an application and processes the information:
In the image above, RUM analyzes the XHR request and processes it by incrementing the availability metric if the response’s status code is 200 OK.
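The counting step described above can be sketched as follows. In a browser, the injected code would be JavaScript; this Python sketch only models the logic that processes each reported request, and the class name, field names, and the decision to also track request duration are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class RumStats:
    """Running totals built from requests reported by real user sessions."""
    total: int = 0
    successful: int = 0
    total_duration_ms: float = 0.0

    def record(self, status_code, duration_ms):
        """Process one observed request, counting it as available if it returned 200 OK."""
        self.total += 1
        self.total_duration_ms += duration_ms
        if status_code == 200:
            self.successful += 1

    @property
    def availability(self):
        """Share of observed requests that succeeded, or None before any data arrives."""
        return self.successful / self.total if self.total else None

    @property
    def mean_duration_ms(self):
        return self.total_duration_ms / self.total if self.total else None
```

For instance, after recording three 200 responses and one 503 response, `availability` would be 0.75.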
Note: Businesses can go for either of the monitoring methods, but the best practice is to use both because they analyze different environments (synthetic and real world), enabling the user to analyze performance much better.
Key API metrics to monitor#
A number of API metrics should be monitored before making an API publicly available. These metrics define the quality of an API, whether it is being developed in-house or adopted from a third party for integration into a business. A few important metrics are given below.
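Key API Metrics to Monitor
Metric | Description
RPM | Requests per minute: the number of requests the API handles each minute.
Latency | The time between sending a request and receiving its response.
Availability | The share of time for which the API is up and responding.
TTFHW | Time to first Hello World: how long a new developer takes to make a first successful API call.
Failure rate/errors per minute | The number of failed requests per minute.
CPU and memory usage | The CPU and memory consumed by the servers hosting the API.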
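Several of these metrics can be derived from a plain list of observed requests. The sketch below is one possible calculation under stated assumptions: each record is a hypothetical `(status_code, latency_ms)` pair, a status of 500 or above counts as a failure, and availability is approximated as the share of requests that did not fail.

```python
from statistics import mean


def summarize(records, window_minutes):
    """Compute basic API metrics for requests observed during `window_minutes`.

    `records` is a list of (status_code, latency_ms) pairs.
    """
    if not records:
        raise ValueError("no requests observed")
    if window_minutes <= 0:
        raise ValueError("window_minutes must be positive")

    total = len(records)
    # Assumption: only server-side errors (5xx) count as failures.
    failures = sum(1 for status, _ in records if status >= 500)
    return {
        "rpm": total / window_minutes,
        "avg_latency_ms": mean(latency for _, latency in records),
        "availability_pct": 100.0 * (total - failures) / total,
        "errors_per_minute": failures / window_minutes,
    }
```

Averages can hide slow outliers, so a production monitor would likely report latency percentiles as well.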
Choosing an API monitoring tool#
API monitoring plays an important role in running a business successfully. Depending on the size and type of application, proprietary or third-party API monitoring tools may be employed. There is a wide range of tools, from free to premium, with different functionalities and metrics to monitor. In terms of monitoring methods, the tools range from synthetic to RUM to a hybrid approach. The following criteria can help a business choose the right monitoring tool:
The monitoring tool should monitor the availability, performance, and latency of an API.
It should offer both synthetic monitoring and RUM to keep an eye on a variety of metrics mentioned above.
It should have the ability to identify third-party APIs affecting the performance of an application.
It should provide a flexible user interface to analyze monitoring parameters.
Script reuse options should be available for testing in synthetic monitoring.
A data-sharing option should be available to share stats with the teams.
It should offer flexible alerting so that the business is made aware of any potential issues.
How do tech giants monitor?#
Tech giants like Google, Facebook (Meta), Amazon, Microsoft, and Spotify treat testing and monitoring as essential to their success. They use manual testing and monitoring for their products and services, but tool-based monitoring also has a significant presence at every tech giant to optimize its services (APIs). For manual monitoring, they use the following methods:
Unit testing is a basic way of monitoring the product or service during development.
Internal testers other than the developers monitor the services by using the APIs and inspecting every possible scenario against the defined requirements.
Crowd testing is another way of monitoring the APIs, in which a group of people use the product and provide feedback.
Dogfooding is applied to ensure the service is usable. This involves having employees use the product in their daily work and provide feedback. Big companies have thousands of employees working in different parts of the world, and they provide better real-time feedback.
Some companies release a beta version of the product to a small number of real users before making it available to everyone.
Big tech companies use automatic monitoring tools (proprietary tools) and a few of the techniques mentioned above to ensure the quality of their services. See the monitoring chapters in our Grokking Modern System Design Interview for Engineers & Managers course for more details.
Summary#
Although API monitoring can’t ensure 100% failure control, it’s an effective tool to detect various failure events before they affect our system and clients. Therefore, it’s crucial for a system dependent on internal or external APIs to use API monitoring tools.
(True or False) API monitoring ensures 100% API failure control.
Answer: False. API monitors can’t detect flaws in business logic and therefore can’t catch all types of API failures.